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SPLITTING AND SUCCESSIVELY SOLVING AUGMENTED 
LAGRANGIAN METHOD FOR OPTIMIZATION WITH 
SEMICONTINUOUS VARIABLES AND CARDINALITY 

CONSTRAINT* 

YANQIN BAlt, RENLI LIANG*, AND ZHOUWANG YANG§ 

Abstract. We propose a new splitting and successively solving augmented Lagrangian (SSAL) 
method for solving an optimization problem with both semicontinuous variables and a cardinal¬ 
ity constraint. This optimization problem arises in several contexts such as the portfolio selection 
problem, the compressed sensing problem and the unit commitment problem, etc. The problem is 
in general NP-hard. We derive an optimality condition for this optimization problem, under some 
suitable assumptions. By introducing an auxiliary variable and using an augmented Lagrangian func¬ 
tion, the constraints are decomposed into two parts. By fixing particular variables, the optimization 
problem is split into two subproblems. The first one is an easy solving convex programming problem, 
while the second one is a complicated quadratic programming problem with semicontinuous variables 
and cardinality constraint. The two subproblems are solved alternatively. The second subproblem is 
transformed equivalently into a mixed integer quadratic programming problem. Based on its features 
of both the objective function and the constraints, we solve the mixed integer quadratic program¬ 
ming problem successively, then obtain its exact optimal solution directly. Furthermore, we prove 
the convergence of SSAL, under some suitable assumptions. Finally, we implement our method for 
the portfolio selection problem and the compressed sensing problem, respectively. Real world data 
and simulation study show that SSAL outperforms the well-known CPLEX 12.6 and the penalty 
decomposition (PD) method, while enjoying a similar cardinality of decision variable. For example, 
the numerical results are even nearly 200 times faster than that of CPLEX 12.6 for the portfolio se¬ 
lection problem, and more than 40 times faster than PD method for the compressed sensing problem, 
respectively. In particularly, SSAL is powerful when the size of problem increase largely. 

Key words, semicontinuous variables and cardinality constraint optimization problem, alter¬ 
nating direction method of multipliers, augmented Lagrangian decomposition, 
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1. Introduction. In this paper we consider the following minimization problem: 

(P) min f{x) 

s.t. X € X, 

Ikllo < K, 

Xi € {0}U[ai,bi], i e {l,2,...,n}, 

where x G R", f{x): i?" —R is a continuously differentiable convex function and 
X G R” is a closed convex set with no empty interior, which represents the general 
constraints for x. ||a:||o denotes the cardinality (i.e., the number of nonzero entries) 
of the vector x. Furthermore 0 < K < n, 0 < Oi < A variable Xi G {0} U [a^, bi] is 
referred to as a semicontinuous variable. 

The problem (P) arises in several contexts such as the portfolio selection prob¬ 
lem (see [5], [13], [16]), the compressed sensing problem (see [4], [7]), the separable 
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quadratic facility location problem [24] and the unite commitment problem (see [15]), 
ect. Obviously, the constraints of (P) are both nonconvex and nonsmooth. Problem 
(P) is a highly nonlinear optimization problems and may have many local minimizers. 
So, it is difficult to solve problem (P), even if to find its local solutions. In general, 
problem (P) is NP hard [5]. For decades, several analytical and approximate meth¬ 
ods, including reformulation, approximation and relaxation, have been developed for 
solving problem (P). 

Padberg and Rinaldi [21] firstly proposed a branch-and-cut algorithm for the 
symmetric traveling salesman problem, which is an integer linear programming prob¬ 
lem. Frangioni and Gentile [15] developed the branch-and-cut algorithm for a general 
program with a convex objective function and the semicontinuous constraint, by us¬ 
ing a perspective function to derive lower bounds on the objective function value. 
Zheng, Sun and Li devolved the SDP approach by applying a special Lagrangian de¬ 
composition scheme for a quadratic programming problem with the cardinality and 
the semicontinuous constraints, which gives the tightest continuous relaxation of the 
perspective reformulation of the problem in [25]. The above exact methods are able 
to solve only a fraction of practically useful models, a variety of heuristic procedures 
have also been proposed. Chang et al. presented three heuristic algorithms based 
upon genetic algorithms, tabu search and simulated annealing for the standard mean 
variance portfolio optimisation model with cardinality constrained in [10]. There are 
also various approximate methods and techniques for dealing with the cardinality 
or sparse constraint. Burdakov et al presented a reformulation for the cardinality- 
constrained problems, including the semi-continuous case and they formulated the 
problems into a mixed-integer ones in literature [8]. Moreover, they also introduced 
a mixed-integer formulation whose standard relaxation still has the same solutions 
(in the sense of global minima) as the underlying cardinality-constrained problem 
[9]. Furthermore, they derived the suitable stationarity conditions and suggested an 
appropriate regularization method for the solution of optimization problems with car¬ 
dinality constraints. Bai et al proposed an alternating direction method of multipliers 
(ADMM) method for £i —£2 regularized logistic regression model in [2]. And Bai and 
Shen developed this method for the consensus proximal support vector machine for 
classification in [3]. 

Recently, Lu and Zhang [18] proposed a novel penalty decomposition (PD) method 
for an Ip norm minimization problem which minimizes a general nonlinear convex 
function subject to [[xUo < K. Besides the cardinality constraints, Cui et al. [13] 
investigated a portfolio selection problem with both the cardinality constraints and 
the semicontinuous constraints, which was reformulated as a mixed integer quadrat- 
ically constrained quadratic program (MIQCQP). They devolved the branch-and- 
bound method by reformulating MIQCQP to a new MIQCQP reformulation, which 
has a more tighter continuous relaxation bound than the MIQCQP. They reported 
the computational results for problems with up to 400 assets. Although the speed 
of the PD method is generally faster than the hard-thresholding algorithm [6], the 
solution obtained by the PD method is not better enough for the solution quality. 
Besides, the reformulation of Cui et al. [13] has more constraints and variables than 
MIQCQP and caused the difficulty that the branch-and-bound method is slow and 
fails to solve large-scale problems. 

Inspired by requirements of both large-scale problems and the better solution 
quality, we propose a new method for solving problem (P), which covers the portfolio 
selection problem and the compressed sensing problem as special cases. Under mild 
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assumptions,we derive the optimality conditions of problem (P). By introducing an 
auxiliary variable and using an augmented Lagrangian function, the constraints are 
separated into two parts. By fixing particular variables, the optimization problem is 
split into two subproblems. The first one is an easy to solve convex programming 
problem, while the second one is a complicated quadratic programming problem with 
semicontinuous variables and cardinality constraint. Therefore, the second subprob¬ 
lem is transformed into an equivalent mixed integer quadratic programming problem 
with 0 — 1 knapsack constraint. Then we propose the augmented Lagrangian alter¬ 
nating direction method and solve the two subproblems alternatively. The second one 
can be solved analytically, due to features of both the objective function and the con¬ 
straints. Furthermore, we prove that our method is convergent. Finally, we implement 
our method for the portfolio selection problem and the compressed sensing problem. 
The numerical results demonstrate that our method is faster than several existing 
methods. For example, in case of the compressed sensing problem, our method is 
more than 40 times faster than that of the penalty decomposition (PD) method in 
[ 18 ]. 

The paper is organized as follows. In Section 2, we describe some examples of 
problem (P) arising from different real-world applications. In Section 3, we establish 
the first order optimality conditions for the problem (P). In Section 4 we present 
our method for solving problem (P). We name it the Splitting and Successively solv¬ 
ing Augmented Lagrangian method (abbreviated as SSAL). We also establish some 
convergence results. In Section 5, we conduct numerical experiments to test the per¬ 
formance of SSAL for the portfolio selection problem in [13] and compressed sensing 
problems. Finally, we present some conclusions in Section 6. 

1.1. Notation. In this paper, the symbols i?" and i?" denote the n-dimensional 
Euclidean space and the nonnegative orthant of i?", respectively. For any vectors x = 
[xi,X 2 , ■. ■,XnY', y = [j/ 1 , 2 / 2 ,. ■. ,yn]'^, x-y= [xiyi,X 2 y 2 , ■ ■ .,Xnyn]'^, and x"^ =x-x. 
Given an index set L C {1,2,...,n} , |L| denotes the size of L, L is the complement 
ofLin{l,2,...,n}. xl denotes the subvector formed by the entries of x indexed 
by L. Likewise, denotes the submatrix formed by the columns of X indexed by 
L. We denote by I the identity matrix, whose dimension should be clear from the 
context. Given a closed set (7 C i?", Nc{x) and Tc{x) denote the normal and tangent 
cone of C at any x G C, respectively. For any real vector, || • ||o and || • || denote the 
cardinality (i.e., the number of nonzero entries) and the Euclidean norm of the vector, 
respectively. 

2. Examples of Applications. In this section, we describe some examples of 
problem (P) arising from the portfolio selection, the compressed sensing and the subset 
selection. More applications can be found in [24]. 

2.1. Portfolio selection problem. Let y and Q be the mean vector and the 
covariance matrix of n assets with return r = {ri,r 2 ■ ■ ■ Xn)'^, respectively. Let x = 
{xi,X 2 ■ ■ ■ Xn)"^ be the portfolio weights to the n risky assets. In real-world, most 
investors only choose a small number of stocks to invest. In other words, we need to 
consider the constraint ||a;||o < K to control the total number of different assets in the 
optimal portfolio. We also need to prevent the investors from holding some assets with 
very small amount due to the transaction cost and managerial concerns. So we need to 
consider the semicontinuous constraints: Xi S {0} U [a^, bi\,i G {1,2,..., n}. A mean- 
variance portfolio selection model with semicontinuous and cardinality constraints can 
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be modeled as: 


(SP) min x^Qx 

s.t. > po, 

e^x = 1 , 

Ikllo < K, 

cci e {0} U [ai,6i], i e {1, 2,... ,n}. 
where e = (1,1 • ■ • 1)^, po is the prescribed return level. 

2.2. Compressed sensing problem. Compressed sensing (CS) is an important 
problem in signal processing (see, for example, [11]).The CS problem with nonnega¬ 
tivity constraints can be formulated as 

(NCS) min -||Ax —6||^ 

s.t. Iloilo < K, cc > 0, 

where A G is a data matrix, b G Rp is an observation vector, K is an integer 

for controlling the sparsity of the solution. More applications and reformulations of 
problem (NCS) can be found in [20] and [17]. In the compressed sensing problem, it 
is often assumed that p < n. 

In multivariate linear regression, we are given p observed data points 
with Oi G i?" and bi G R. The goal is to minimize the least square measure of 
— bi)^ with only a subset of the prediction variables in x. This subset 
selection problem then has the same form as the problem (NCS). In contrast with 
the case of compressed sensing, the number of data in subset selection is often much 
larger than the dimension of the data (p > n). 

In practice, we can always impose an upper bound on x, i.e. x < u, for some 
sufficiently large positive numbers u (see [24]). 

(NCSB) min ^\\Ax - b\\^ 

s.t. j|a:|]o < K, 0 < x < u. 

3. The first order stationary conditions. In this section, we study the first 
order stationary conditions of the problem (P). 

Theorem 3.1. Assume that x* is a local minimizer of problem (P). Let J* = 
{1 < j < n : X* 0}, r* = j J*j. Suppose that the following Robinson condition 


(3.1) 

f XZ) 

d GTx{x*),Sa(x-) <0,tB(x-) <0 , 


1 V rfp J 

J 


holds, where 


A{x*) = {iGJ*: X* = b^}, B{x*) = {iGJ*: x* = a^}. 

Then there exists {ly* ,r]*, k*) G P" x P" x P" together with x* satisfying 

0 G yf{x*) + iy*-p* + K*+ Nx{x*), 
(3.2) ly* > 0, p* > 0, ly* {x* - b,) = 0, p* (-a^ -h a:*) = 0, i = l,...,n, 

K*=0, JGJ*. 
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Proof. The problem (P) can be reformulated to the following problem: 

(P) min f{x) 

s.t. X £ X, 

Qi < Xi < bi, i £ supp(x), 

Xi = 0, i £ supp{x), 

\supp{x)\ < K, 

where supp{x) = {l<i<n\xi^ 0}, and supp{x) is the complement of supp{x) in 
{1,2,..., n}. By the assumption that x* is a local minimizer of the problem (P), one 
can observe that x* is also a local minimizer of the following problem: 

(P) min f[x) 

s.t. X £ X, 

O'i ^ Xi ^ bi^ % £ J , 
a;, =0, j£r. 

We observe that the problem (P) can be reformulated equivalently as 

(PI) min f{x) 

s.t. X £ X, 

/J. {x-b)< 0, 

/J. (—a; + a) < 0, 

/J.a: = 0. 

Using theorem 3.25 of [22], when the Robinson condition (3.1) holds, there exists 
(i/*,77*, K*) G P” X P" X P" together with x* satisfying 

(3.3) 0£Vf{x*) + iy*-T]*+K*+Nx{x*), 

(3.4) r,*>0, ^:(x*-b,) = 0, r,U-a,+x*) = 0, i£r, 

(3.5) ^*=0, r,* = 0, i£j*, 

(3.6) K* = 0, j£ r. 

Note that (3.4) implies that if a:* ^0 and x* ^ bi , then v* = 0. If a:* ^ 0 and 
y* ^ Oi, then rj* = 0. The equation (3.5) implies that if a;* = 0 , then v* = 0 and 
rj* = 0. Combining (3.4) with (3.5), we obtain 

i^*{x*-b,) = 0, r,*(-a, + x*) = 0, *=l,2,...,n. 

It implies that the conclusion holds. □ 

4. SSAL algorithm. In this section, we propose a SSAL method for solving 
problem (P). Firstly, we derive the augmented Lagrangian decomposition formulation 
(P 2 ) of (P). Secondly, we discuss two subproblems obtained from fixing some variables 
respectively. Thirdly, we propose the method for solving problem (P) which minimizes 
(P 2 ) with respect to x, y in an alternating fashion while updating A in the iteration. 
Finally, we establish some convergence results for problem (P). 
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Introducing y = x, we can rewrite (P) as 


(Pi) min f{x) 

s.t. X € X, 
y = x, 

II2/II0 < K, 

G {0}U z=l,2,...,n. 

Then, we define the augmented Lagrangian function for (Pi) as follows: 

Lp{x, y, A) = f(x) +\^{y-x) + f^\\y - xf 

where A G i?" is the Lagrangian multiplier associated with the equality constraint 
y = X and p > 0 is a penalty parameter. The resulting augmented Lagrangian 
decomposition formulation is: 


(P 2 ) min Lp{x,y,X) 
s.t. X G X, 
y&Y, 


where 

(4.1) Y = {yG i?”| IIj/IIo <K,y,G {0} U [a„ b,]}. 

In the following part, we first discuss the subproblem obtained from fixing the 
variable y, and then we discuss the subproblem Py for fixing the variable x. 

4.1. The convex programming problem of variables x. We notice that for 
given A = A, when y = y gY is fixed, (P 2 ) becomes: 

(Px) min Lp{x,y,X) 

s.t. X G X. 

Note that AT is a convex set and the objective function of (P^) is a convex function 
of X. Thus, (Pa;) is a convex programming problem of variables x. 

4.2. Minimization of separable quadratic function. We consider the sub¬ 
problem of (P 2 ) for given A = A when a; = x G AT is fixed, problem (P 2 ) becomes 

(Py) min Lp{x,y,X) 

s.t. y GY. 

The object function L{x, y, A) is a quadratic function of variable y, so we can simplify 
the objective function L{x,y,X), and then transform the subproblem {Py) to the 
2—norm approximation problem with semi-continuous variables and the cardinality 
constraint. 


Lp{x,y,X) = f{x) + X^{y-x) + |||y-a;|p 

= f{x) - X^x + |||sf -f ^[||yf - 2{x - ^A)^y]. 
6 



Let w = X — ^A, the object function L{x,y,X) can be rewritten as: 


(4.2) 


Lp{x,y,X) = f{x) - X'^x + ^\\xf + |(||?/f - 2w^y) 

= f{x) - X^x + ^\\xf - |||w||^ + f lly “ 
= f{x)-+ ^\\y-'^f 


For given A, when a; = a; G X is fixed, ^||A|p, w = x — XX are constants. From (4.2), 
the minimization of the problem {Py) is equivalent to the following problem : 


(P 3 ) min y^^yi - Wjf 

i=l 

s.t. y &Y. 

Introducing a variable Zi G {0,1} to indicate the zero or nonzero status of each 
decision variable j/i, i = 1,.. .n. Problem (P 3 ) can be reformulated to the following 
mixed integer program (MIP), which can be solved by CPLEX 12.6. 


(MIP) min ^(yi-Wi)^ 

s.t. < K, ZiUi < yt < Zibi, 

ZiG{ 0 ,l}, i = l, 2 ,...,n. 

Let y be the optimal solution of the problem (Py), then there exists z together with 
y such that {y,z) is the optimal solution of the problem (P 3 ). Because the CPLEX 
12.6 ignores the feature of this problem (MIP), it is slow, especially for large-scale 
problems. So we propose new method to obtain the optimal solution (y, z) of (MIP). 

Note that the value of variable Zi only has two situations: either Zi = 0 or Zi = 1. 
Moreover Zi = 0 indicates yi = 0 and Zi = 1 indicates < yi < hi, i = 1, 2,..., n. 
We can get the optimal solution (y, z) of problem (MIP) successively. 

Firstly, taking into account that either Zi = 1 or Zi = 0, we introduce auxiliary 
variables qi and as follows: 

Qi = mm(y, - w*)^ = mm(y, - = wf, where yi = 0, 

zi=0 yi=0 


r,= mm(yi - Wi)'^ = min {yi-Wif 

Zi — 1 ai<yi<bi 

{ (a* - Wif if Wi < Qi, where iji = a^, 

0 if ai < Wi < bi, where yi = wt, 

{hi - Wif, if Wi > bi, where iji = bi. 

Secondly, we transform the problem (MIP) to the following problem, which is a 
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optimization problem over z : 


(4.3) 


min{^(yi - Wi)"^ |e’^z < K,z e {0,1}”, ZjOi < yi < Zih,i = 1,2 ,..., n} 

i=l 

n 

= min{^ nzi + (1 - Zi)q^ \ z < K,z £ {0,1}"} 

n 

= minjy^ Qi + (ri - qi)zi \ z <K,z€ {0,1}"} 

n n 

= J2w^ + min{^(r, - w‘f)z, \e^z<K,z£ {0,1}”} 

From (4.3), the optimal solution of problem (MIP) is obtained by solving 

n 

(ILP) min{y^(rj - wf)Zt \ z < K,z £ {0,1}”}. 

If 7^ = n, the cardinality constraint is void, hence we are left with the simple problem 

n 

min{y](ri - wf)z^ jz G {0,1}”}, 
whose optimal solution is given by: 

11 , if ri - w1 < 0 , 

•^2 — N 

I 0, otherwise. 

In order to deal with the general situation (i.e., if iG < n) we define v = r — . 

Without loss of generality we assume that v is sorted in ascending order: vi < V 2 < 
• • • ^ Vn- Then we have the following theorem. 

Theorem 4.1. The optimal solution z of problem (ILP) is given by 


Zi = 


1 , if f and Vi < 0 , 
0 , otherwise. 


Proof. It is obvious that the optimal value is either negative or zero. If the number 
of the negative entries of v is less than K, the optimal value is the sum of all negative 
entries of u, so the optimal solution is: 

I 1 , if Uj < 0 , 

Zi = < 

I 0 , otherwise. 

If the number of the negative entries of v is more than K, the optimal value is the 
sum of the k least entries of v, so the optimal solution becomes. 


— 


1, if i < K, 
0 , otherwise. 


□ 

After Theorem 4.1, the explicit solution of {Py) can be given as in the next 
theorem. 

Theorem 4.2. For given A = A and x = x G X, the optimal solution y of the 
problem (Py) is given by 

yi = Zi-Ta\n{bi,msiyi{wi, Oi}}, for i = l,2 ,...n, 

where z can be obtained from Theorem f.l. 

Yet we are ready to describe our algorithm, named Splitting and Successively 
solving Augmented Lagrangian method or SSAL method, as in Algorithm 1. 


Algorithm 1: SSAL method for solving the problem (P) 


1 

2 

3 

4 

5 


Input: Choose tolerance parameter e > 0, multiplier vector A°, penalty 

parameter p > 0 and the step-size w. Set y^ GY and the iteration 
counter k = 0. 

Output: An approximate optimal solution {x,z) of problem (P). 

Solve the subproblem problem (Pj,) with y = y° to obtain an optimal solution 
x° 

while ||a;^ — > e do 

update z^^^ and = argminj,gyLp(a:^, y, A^), according to 

Theorem 4.1 and Theorem 4.2 

update = argmin„,gj(-Pp(a;, y^+^, A^) 

update the multipliers by 


(4.4) 


A^i=A?+u;p[yf+i-x^i], 


i = 1, 2 ,..., n 


6 increase k by one and continue 

7 return x = x^ and z = z^ 


4.3. Convergence analysis. In this subsection, we show that any accumulation 
point of the sequence generated by SSAL method satisfies the first order stationary 
conditions. 

Theorem 4.3. Let {x,y,X) be any accumulation point o/(a;^, y^, A^) generated 
by algorithm 1, J = {t < j < n : Xj ^ tf\ and r = |J|. Assuming that {A^} is 
bounded, we have 

(4.5) hm [(x'=+\y'=+\A'=+i) - (x^y^A'=)] = 0 

k—^oo 

and the Robinson condition (3.1) holds at x for such J . Then, x satisfies the first 
order stationary conditions (3.2). 

The assumption on (4.5) and the boundedness of the multiplier vectors in The¬ 
orem 4.3 are the standard condition in the convergence analysis of the augmented 
Lagrangian methods for nonconvex optimization problems (see [19]). Similar condi¬ 
tions have been used in the convergence analysis of alternating direction methods of 
multipliers for nonconvex optimization problems (see [23] and [1]). 

Before proceeding to the proof of Theorem 4.3, we prove two lemmas. 
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Lemma 4.4. Let = {^ < j < n : ^ 0} and then 

there exists {v^,rf, k^) S i?" x i?" x i?" together with the and penalty 

parameter p satisfying: 

(4.6) 0 G V/(x'=+i) + u^-p’^ + k’^ + p{x^+^ - x’^) + Nx{x’^+^), 

(4.7) ^'f>0,?7f>0, i/f(yf+i-6,) = 0,?7f(yf+i-ai) = 0, i = l,2,...,n, 

(4.8) 

Proof. Using (4.3) and the same augment in Theorem 3.1, we have is also a 
local minimizer of the following problem: 

(P 4 ) min \\y-x^ + -X'^f 
P 

s.t. /Jfc+i (y - 6) < 0 
djfc+i {—y + a) < 0 
I'jk+iy = 0 . 

Now, this problem is a easy to solve convex optimization problem. And there 
exists together with y^'^^ satisfying: 

(4.9) 0 G p(/+^ - a:'= + -A'^) + + k\ 

P 

(4.10) v'f > 0, ? 7 f > 0, (yf+^ - h) = 0, yf (yf+^ - a,) = 0, i = 1,..., n, 

(4.11) K'f =0, j G 

Using the definition of x^^^ in algorithm 1 and the Theorem 3.34 of [22], we get 
the conclusion that x^^^ satisfies the following relationship: 

(4.12) 0 G V/(x'=+i) - - p(/+i - x'=+i) + Nx{x^^^). 

Combining (4.9) and (4.12), we obtain 

0 G p(y'=+i - + ^A'') + - 77 '= + 

P 

+ V/(x'=+^) - A'= - p{y'^+^ - x'^+i) + iVx(x'=+i), 

which implies that 

0 G V/(x'=+i) + + p(x'=+^ - x^) + Nx[x^^^). 

Combining this observation and (4.10), (4.11) , we see that the conclusion holds. □ 
The proof below uses that Oi ^ 0, bi ^ 0. In the compressed sensing problem, we 
can solving this issue by taking Ui > 0, but very small. In practice, this works. 

Lemma 4.5. There exists a suhsequenee K sueh that {(z^^, ry^, bounded. 

Proof. Since (x, y. A) is an accumulation point of (x^,y*^,A^), there exists a sub¬ 
sequence {(x^, y^, A^)}^g 75 - —>■ (x,y. A). It follows from (4.4) that yf^^ — xf^^ = 
(A^^^ — \^)/wp. Taking into account the assumption (4.5), we have 


(4.13) 


X = y. 
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Suppose for contradiction that it is unbounded. By passing to a subsequence if 
necessary, we can assume that At^)|| —)• oo. Let 

then IIK^)|| = 1. There exists a convergent subsequence K C K, such that 

77 ^, —>■ (z?, 77 , it), as k € K ^ 00 , KQK, 

Clearly, |(z/, 77 , k)|| = 1, it > 0 and 77 > 0 . Divide both sides by ||(z^^, 77 ^^, k'')|| in (4.6) 
and take limits as k G K ^ 00 . Noticing f{x) is a continuously differentiable convex 
fnnction and using the the semi continnity of Nx{-) (Lemma 2.42 of [ 22 ]), we obtain 

(4.14) —D + fj — RGNx{x) 

Dividing both sides by ||( 7 /^, 77 ^, k^)|| in (4.7) and (4.8), we have 

(4.15) pf(yf+i-5,)=0, 77f(yf+i-a,) = 0, 7 = l,2...,n, 

(4.16) r’; = o, 

Taking limits as fc € iL —?■ 00 in (4.15), using (4.13), we have 

(4.17) Di{xi-bi) = 0, fii{xi - Qi) = 0, 7 = l,2,...,n. 

By the dehnition of J, we have Xi = iji = 0, for i € J. Combining this with (4.17), 
ai ^ 0 and bi ^ 0, we obtain Di = 0, 77 ^ = 0. Moreover, we have ||i 7 || = \\Dj\\, 
ll^ll = ^ and 77 = Ijfjj. 

Now we show that for I £ J, Ri = 0. For I £ J, yi ^ 0. Note that yf -£ yi, there 
exists a sufficiently large k, such that yf 7 ^^ 0 for fc > fc. Combining this with (4.16), 
we have Iff = 0 for I £ J, k > k. Taking limits as k £ K ^ 00 , we obtain Ri = 0, for 
I £ J. Thus, we have ||k|| = ||k;j|| and R = IjRj. 

Since Robinson’s condition (3.1) is satisfied at a;, there exist 

(4.18) d£Tx{x), s£R^, i£R\ 
such that 

(4.19) SA(a) < 0, tB(ai) < 0, Ijd-s = -Dj, -Ijd-i=-fjj, ljd = -Rj. 

Using (4.14), (4.18), (4.19), ||P|| = ||ifj||, ||? 7 || = || 77 j||, ||k|| = ||kj||, D = IjDj, 
fj = Ijfjj and R = IjRj we have 

ll^^f+ ll^f+ ll«f = + II^J|P + IIMP 

= -[(-i^j)^i^J + i-Vjffij + i-R)^jRj] 

= + i-Ijd-ifvj + {iJdfRj] 

= d^{-Ijvj + Ijfjj - IjRj) + {s'^f)j + fvj) 

= d^{—D + fi — R) + + F’ 77 ) = + F ’?7 

In addition, it is following from (4.17) that if Xi ^ bi, then Pi = 0. Thus SiPi = 0. If 
Xi = bi, then Si < 0. combing this P > 0, we have SiPi < 0. Hence, s^P < 0. By a 
similar argument as above, one can show that P"fj < 0 . 
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Using these relations, we have 


+ ll^f + ll«f = + Ffj < 0. 

It yields || (i>, f], k)|| = 0, which contradicts \\{v, ry, it)|| = 1. Therefore, the subsequence 
is bounded. □ 

Proof of Theorem 4.3 

Proof. Taking into account that {{F, F, is bounded, then there exists a 

subsequence K C K such that {F,ri'^,F) —>■ (z),!), it) as k G K —>■ oo. Taking limits 
in (4.6),(4.7) and (4.8) as k G K ^ oo, using the relation x = y, the assumption (4.5) 
and the semicontinuity of Nxo{-), we get 

0 G V/(a;) + p — f + k + Nxix) 

Pz>0, r)i > 0, i>i{xi - bi) = 0, rii{-ai + Xi) = 0, i=l,2,...,n, 

ki = 0, i G J 

Hence, {k, fj, k) together with x satisfies the first order stationary conditions (3.2). 

□ 

5. Numerical results. In this section, we conduct numerical experiments to 
test the performance of SSAL proposed in Section 4. We select the portfolio selection 
problem (Pns) [13] and the compressed sensing problem (NCSB) as examples. We use 
real market data to construct the test problems (Pats). We also present simulation 
study of the performance of SSAL for the portfolio selection problem {Pns) [13] and 
the compressed sensing problem (NCSB). The experiments demonstrate that as far as 
the quality of approximate solutions is concerned, SSAL outperforms the PD method 
and is almost as good as the CPLEX and the MIQCQPi. At the same time, the 
computational time for SSAL is significantly smaller than the other three methods. 
The numerical tests are implemented in MATLAB 7.14 and run on a PC (2.10G Hz, 
2GB RAM). 

5.1. Portfolio selection problem with nonsystematic risk constraint. In 

this subsection, we verify the efficiency and stability of SSAL method in portfolio 
selection problem (Pjvs) of [13]: 

(Pns) min f{x) = x^{Q + D)x 
s.t. g{x) = x"'"Dx < ctq, 
oFX > pq, x = 1 , 

[kilo < K, 

cci G {0} U [ai,&i], i G {l,2,...,n}, 

where Q is a semi-definition matrix and D is a diagonal matrix. Clearly, problem 
(Pats) is in the form of (P), thus SSAL proposed in Section 4 can be suitably applied 
to solve (Pats)- The x—update step involves a quadratically constrained quadratic 
program (QCQP), which is 

(QCQP) min x'^ {q + D A-x - {\ + py)^ x 

s.t. g{x) = x'^Dx < (To, a^x > po, 

e^x = 1, 0 < X < &. 


12 


We use the CPLEXQCP solver in CPLEX 12.6 [12] with MATLAB interface to solve 
the (QCQP) problem. The y—update step has exactly the same expression as in 
the Theorem 4.2. It is obviously that the main computational effort of SSAL lies in 
solving the subproblem (QCQP). In our test, the parameters in SSAL method are 
set as follows: the step-size w = 0.3, the initial Lagrangian multiplier A° = 0 and 
the initial penalty p = 1- We set the prescribed weekly return level po = 2 x 10“^, 
the prescribed nonsystematic risk level tro = 1 x 10“^, at = 0.01, and bi = 0.3, 
i = 1,2,... ,n. The cardinality upper bound AT = 10, and the tolerance parameter is 
set as e = lO”'^. 


5.1.1. Real data set. In this subsection, we use the weekly return data of 
481 stocks from Standard&Poor’s 500 index between 2005 and 2010. We take linear 
regressions on 10 sector indexes of Standard&Poor’s 500 index to construct a sector 
factor model. The objective function of (Pats) is the covariance obtained by the sector 
factor model while the quadratic constraint controls the nonsystematic risk Dx in 
the sector factor model. We generate 5 instances for each test problem with the 
same size (n=50,100,150). For each instance, we choose n stocks from the 481 stocks 
randomly. 

In order to test the accuracy of solution obtained by SSAL method, we compare 
SSAL with the CPLEXMIQCP solver in CPLEX 12.6 [12]. Introducing a variable 
Zi € {0,1} to indicate the zero or nonzero status of each decision variable Xi, i = 
1 , 2 ,..., n, problem (Pns) can be reformulated as follows: 


(MIPo) min x"^{Q + D)x 
s.t. x"^Dx < ag, 

X > po, X = 1 , 

z < K, ZiQi < Xi < Zibi, 

Zi€ {0,1}, i = l,2,...,n. 

The experiments are conducted by using the default setting of CPLEX 12.6 with the 
maximum CPU time set equal to 3600 s. 

To illustrate the performance of SSAL for test problems (Pns) specifically, we 
present three comparison results in terms of the computational time in seconds, the 
number of iteration and the objective value explored by SSAL method and CPLEX 
12.6 in Table 1. We also record the relative difference (rel.dif) in Table 1, where 
re\.dil:={fval{xssAL) - fval (xcplex)) /fval{xcpLEx)- 

The numerical results in Table 5.1 show that the cost of computational efforts of 
SSAL is significantly less than the well-known solver CPLEX 12.6 for solving (Pats) 
on a real world data. For instance, the example listed the last row in the Table 1 
shows that CPU time of SSAL is almost 200 times faster than that of the CPLEX. 
Obviously, we conclude that the SSAL method outperforms the solver CPLEX 12.6. 
Note that the optimal objective value of the problem (Pns) solved by SSAL is almost 
same to that of the CPLEX. The largest absolute value of the relative difference 
(rel.dif) is less 4%. 

Then we continue to compare SSAL method with the methods provided by Cui 
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n 

SSAL 


CPLEX 

rel.difxlO ^ 

fvalxlO ^ 

cputime 

iter 

fvalxlO-'* 

cputime 

50 

2.3503 

0.3746 

7 

2.2981 

0.9093 

2.2725 

50 

2.7138 

0.1697 

5 

2.6584 

1.4307 

2.0845 

50 

3.7464 

0.2343 

5 

3.6557 

2.0024 

2.4788 

50 

3.5357 

0.6295 

16 

3.5446 

2.7720 

-0.0507 

50 

2.1145 

0.2257 

7 

2.0819 

1.3988 

1.5695 

100 

2.3550 

0.7521 

8 

2.3294 

9.8475 

1.0969 

100 

1.4998 

0.7698 

10 

1.5232 

15.8758 

-1.5421 

100 

4.3390 

0.9155 

10 

4.3077 

6.5231 

0.7267 

100 

2.0593 

1.0874 

14 

2.0380 

46.6908 

1.0468 

100 

1.6443 

1.3415 

15 

1.6683 

84.1186 

-1.4396 

150 

1.4061 

2.1155 

11 

1.3803 

168.8789 

1.8750 

150 

2.8602 

1.4629 

8 

2.8579 

34.4529 

0.0799 

150 

1.8396 

2.1173 

13 

1.8023 

312.4525 

2.0674 

150 

1.4054 

3.7521 

16 

1.3631 

497.5107 

3.1043 

150 

1.2593 

3.9929 

18 

1.2492 

732.9191 

0.8100 


Table 5.1: SSAL versus CPELX 12.6 on the (Pats) problem 


et al in [13], where a new model defined as MIQCQPi as follows: 

n 

(MIQCQPi) min /(x, [Q + D)x + ^ diSi 

2-1 

n 

s.t. g{x) = ^ di6i < (To, a^x > po, 

i=l 

e^x = 1, ||x||o <K, Xi< SiZi, 

e {0} U [oi, 6 i], * e {1, 2,... ,n}. 

Problem (MIQCQPi) is also solved by the mixed integer QCP solver in CPLEX 12.6 
with MATLAB interface using continuous relaxation for generating lower bounds. Cui 
et al claimed that their MIQCQPi is more efficient than the CPLEX. 

Now we generate the above test problems with n ranged from 100 to 400. For 
each n, we generate 20 instances by selecting n stocks from the 481 stocks randomly. 

We illustrate the numerical results by mean of Figure 5.1, comparing the average 
computational time of SSAl with that of (MIQCQPi). Figures 5.1 shows that SSAL 
also promises (MIQCQPi). For example, our method is 20 times faster than that of 
(MIQCQPi), under the best situation. 

5.1.2. Simulation data set. To further demonstrate the effective of perfor¬ 
mance of SSAL for large-scale problems, in this subsection, we simulate the test ex¬ 
amples with the large size from 500 to 1000. The simulation data is generated in the 
same fashion as in [26]. The parameters in the model (Pats) are generated randomly 
from the uniform distributions: ai S [0,0.03], jSij G [0.3,2]/m, i = l,2,...n, j = 
1 ,2,... m; (Tij, the covariance of factors i and j is calculated with sampled series from 
[0,0.4] randomly, i = 1,2,... ,m , j = 1,2,... ,m; a^i G [0, 0.002], i = 1, 2,... n. 

Our goal is to explore the tendency of the computational time as n increasing. 10 
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Fig. 5.1: Average CPU time (in seconds) for problems (Pats) of SSAL and 
[MIQCQPi) 


n 

Computating time (s) 


Iteration 


minimum 

maximum 

average 

minimum 

maximum 

average 

500 

21.0029 

48.8297 

29.82437 

8 

15 

12 

600 

36.9216 

57.6063 

47.9425 

8 

11 

10 

700 

68.4558 

103.8730 

85.6297 

9 

12 

11 

800 

126.9821 

160.2004 

140.9647 

8 

10 

9 

900 

173.8905 

253.4003 

212.5805 

8 

10 

10 

1000 

249.5224 

493.6785 

355.7434 

8 

14 

11 


Table 5.2: The variability for the large-scale problems 


instances are generated randomly and recorded the three indexes: minimum, maxi¬ 
mum and average numbers of iterations and computational time, respectively. It is 
importance to note that CPLEX 12.6 is not able to solve the most of instances within 
3600s. 

It is clear from Table 5.2 that SSAL is robust and efficient as the number of assets 
increases. In addition, the number of iteration less depends on the problem size. 

5.2. Compressed sensing and subset selection. In this subsection, we apply 
SSAL method proposed in Section 4 to solve the problem (NCSB). We generate data 
as follows: the data matrix A € Rp^'^ with orthonormal rows, each entry from a 
standard Gaussian distribution a^- ~ A/"(0,1). The original signal / is generated with 
K nonzero entries, each sampled is the absolute value of x, where x from an Af{0, 1) 
distribution. An observation vector b = Af + r G Rp, where r is Gaussian noise of 
variance cr^ = 0 . 01 . 

Firstly, we illustrate the behavior of SSAL visually. The cc-update step involves an 
unconstrained quadratic optimization problem P^. The y—update step has exactly 
the same expression as in the Theorem 4.2, where the semicontinuous bound = 
1X 10“®, z = 1, 2,..., n. We generate one instance with (p, n) = (1024, 2048), K = 500 
randomly. The parameters in SSAL are set as follows: the initial Lagrangian multiplier 
A° = 0, the tolerance parameter is set as e = 10“^, u = 10, the step-size oj = 1 and the 
initial penalty p = 1- We use notations ’’Original” and ’’SSAL” as the original signal. 
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Fig. 5.2: Compare the performance of SSAL method and the PD method to the 
problem (NCSB) 


the signal recovered by the SSAL respectively. From the bottom graph of Figure 5.2, 
it is clear that our method almost recover the original signal exactly. 

We also compare the performance of SSAL with the PD method of [18] for this 
simple test. For the PD method, we set the tolerance parameter eps = 10“® and the 
initial point as 0. To evaluate the quality of these sparse approximate solutions, we 
adopt a similar criterion as described in [14]. The associated squared error is defined 
as: 


MSB :=i||/-/||2, 

where / is an estimate of the original signal /. 

Comparing the top graph to the bottom one in Figure 5.2, we observe that SSAL 
is better recover the signal than the PD method, and the MSE of the approximate 
solutions obtained by SSAL are much lower than that of the PD method. 

In order to illustrate that SSAL is the winner, we compare SSAL with the PD 
method [18] with two different sizes. We construct the testing data and then set of all 
the parameters values in the same way as the previous test. For each size, we apply 
SSAL and the PD method to solve problem (NCSB) with four different values of K. 
For each such AT, we generate the data set consisting of 50 instances randomly. The 
first 50 problems with K = 50, then K = 100, K = 150 and K = 200. Figure 5.3 
and Figure 5.4 show the 10-logarithm of the ratio tpo/tssAL and the 10-logarithm 
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Fig. 5.3: The 10-logarithm of the ratio for {p,n) = (512,1024) 



Fig. 5.4: The 10-logarithm of the MSE ^sesVal “ (512,1024) 


of the ratio MSEpd/MSEssal for {p,n) = (512,1024), respectively. Figure 5.5 and 
Figure 5.6 show the 10-logarithm of the ratio ipp/tssAL and the 10-logarithm of the 
ratio MSEpd/MSEssal for (p,n) = (1024,2048), respectively. 

From Figure 5.3 and Figure 5.5, we observe that SSAL is substantially faster than 
the PD method for every case. Moreover, our method is 40 times faster than the PD 
method on average. For the best situation, our method is more than 100 times faster 
than the PD method. In addition, SSAL outperforms the PD method in terms of 
solution quality since it archives much smaller MSE values. Thus, SSAL much better 
recovers the original signal than the PD method. 

Empirical study demonstrates that SSAL is a powerful method to solve problem 
(P) for the following reasons: (1) SSAL can find the good enough approximate solution 
of problem (P). For example, the optimal objective value obtained by the SSAL and 
the CPLEX are almost the same for the problem (Pats). Furthermore, the quality of 
the approximate solution of problem (NCSB) obtained by SSAL is generally better 
than that of the PD method. (2) SSAL is a very fast method. For instance, SSAL is 
200 times faster than the CPLEX method and 5 times faster than the (MIQCQPi) 
method of [13] for the problem (Pats). Eor the (NCSB) problem, SSAL is 40 times 
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Fig. 5.5: The 10-logarithm of the ratio for (p, n) = (1024,2048) 



Fig. 5.6: The 10-logarithm of the MSE for {p, n) = (1024, 2048) 


faster than the PD method of [18]. 

6 . Conclusion. We have proposed a splitting and successively solving aug¬ 
mented Lagrangian (SSAL) method. Subproblems have been decomposed by hxing 
certain variables and solving them at each iteration alternatively. The convergence 
of SSAL has been proved, under some suitable assumptions. Real world data and 
simulation study show that SSAL outperforms the CPLEX 12.6, the (MIQCQPi) 
reformulation [13] and the PD method [18], while enjoying a similar cardinality of 
decision variable. In particularly, SSAL is powerful when the size of the problem 
increase largely. 
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